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Abstract 


The Tianlai cylinder pathfinder is a radio interferometer array to test 21 cm intensity mapping techniques in the 
post-reionization era. It works in passive drift scan mode to survey the sky visible in the northern hemisphere. To 
deal with the large instantaneous field of view and the spherical sky, we decompose the drift scan data into m- 
modes, which are linearly related to the sky intensity. The sky map is reconstructed by solving the linear 
interferometer equations. Due to incomplete uv coverage of the interferometer baselines, this inverse problem is 
usually ill-posed, and regularization method is needed for its solution. In this paper, we use simulation to 
investigate two frequently used regularization methods, the Truncated Singular Value Decomposition (TSVD), and 
the Tikhonov regularization techniques. Choosing the regularization parameter is very important for its application. 
We employ the generalized cross validation method and the L-curve method to determine the optimal value. We 
compare the resulting maps obtained with the different regularization methods, and for the different parameters 
derived using the different criteria. While both methods can yield good maps for a range of regularization 
parameters, in the Tikhonov method the suppression of noisy modes are more gradually applied, produce more 
smooth maps which avoids some visual artefacts in the maps generated with the TSVD method. 
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1. Introduction 


The Tianlai experiment is designed to develop and test the 
HIIntensity Mapping (IM) technique (Chen 2012; Li et al. 
2020), in which the large scale distribution of neutral hydrogen 
is observed at low angular resolutions without resolving 
individual galaxies. This allows fast survey speed to cover 
the large volume required for cosmological studies (Chang 
et al. 2008). This technique has been applied to the observation 
of the Green Bank Telescope (GBT) and the Parkes telescope 
(Chang et al. 2010; Masui et al. 2013; Switzer et al. 2013; Wolz 
et al. 2017; Anderson et al. 2018; Wolz et al. 2021). Other 
existing or ongoing HIIM experiments focusing on the late- 
time cosmology include both single-dish telescopes and 
interferometers, such as FAST (Hu et al. 2020), BINGO 
(Battye et al. 2013), CHIME (CHIME Collaboration et al. 
2022), and SKA in the near future (Square Kilometre Array 
Cosmology Science Working Group et al. 2020). However, 
HIIM observation is very challenging due to the strong 
foreground radiation that is 4-5 orders of magnitude brighter 
than the cosmological signal. Additionally, the instrumental 
systematics and other observational effects also complicate the 


separation of the two, a highly sophisticated analysis is 
required to extract this signal (Liu & Shaw 2020). 

The Tianlai experiment includes a cylinder array and a dish 
array. The Tianlai cylinder pathfinder array is fixed on the 
ground and relies on the rotation of the Earth to survey the sky. 
It consists of three adjacent 15 m x 40 m cylindrical reflectors, 
on which there are 31, 32, 33 feeds equipped from east to west, 
respectively. The basic performance of the Tianlai arrays has 
been analyzed in Li et al. (2020) for the cylinder pathfinder, 
and in Wu et al. (2021) for the dish pathfinder. Although 
making synthesis images of the sky from the interferometric 
raw data is strictly speaking not needed for the power spectrum 
estimation, in practice it is still an essential procedure to 
compress the data for further scientific analysis and to provide 
an intuitive means of checking the data quality and algorithms 
applied. 

The output of an interferometer, the interferometric visibi- 
lity, is the cross-correlation between the voltage of two feed 
elements. For the unpolarized case, the visibility is given by 


YO = [TMA DARA De™PA, (1) 
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m f T (A) B,(n; t)d2A. (2) 


where T(f) is the sky brightness temperature, ñ gives the 
direction on the sky, A; is the beam of feed i, uj; = (r; — rj) /Xis 
the separation between feeds in the unit of observed 
wavelength. In the second line, we introduce the beam transfer 
function B;;. In discrete form, the integral is replaced by a sum 
over pixelated sky indexed by n; as follows 


V(t) = D2 Tn) Band AQ, (3) 


where AQ is the pixel angular size, or in a matrix-vector form, 
V= BT. (4) 


Synthesis imaging is to estimate T(#) given the measurements 
V,; In the flat sky limit, the measured visibilities of the 
interferometer array correspond to Fourier modes of sky 
intensity variation, with the orientation and angular scale 
determined by the direction and length of the baselines. 
However, in practice the baselines of an array often do not form 
a complete coverage of the spatial frequency domain, so 
reconstructing the sky map from the visibilities is not trivial, as 
it is mathematically an ill-posed inverse problem, with no 
unique solution. To overcome the gap of measured visibilities 
on the uv plane, a number of techniques have been developed 
to reconstruct the image in such cases, e.g., the variants of the 
CLEAN algorithm (Hégbom 1974), and some burgeoning 
methods based on neural networks (Xu et al. 2020; Connor 
et al. 2022; Schmidt et al. 2022). 

The general principle of sky image reconstruction for Tianlai 
Array was investigated in Zhang et al. (2016a, 2016b). Zuo 
et al. (2021) developed a pipeline for the map-making process 
for the Tianlai array. In a recent paper (Yu et al. 2023, 
heretofore referred to as paper I), we presented a simulation of 
the Tianlai cylinder pathfinder array up to the making of 
synthesis map, taken into account both thermal noise and 
calibration error. This simulation reveals that although the 
Tianlai array is a relatively compact and dense array, there are 
still incompleteness in its baseline coverage, and as a result, the 
reconstructed image has some artifacts arising from the 
incomplete uv coverage or beam side lobes. In the present 
paper, we will investigate regularization methods for dealing 
with the ill-posed inverse problem. 

The structure of this paper is as follows. In Section 2, we 
present our simulation setup and give a brief summary of the 
m-mode formalism for map-making. In Section 3, we apply the 
regularized m-mode formalism imaging to the simulation data 
and explore the choice of appropriate regularization para- 
meters. In Section 4, we discuss the errors in these regularized 
maps, and finally in Section 5, we present our conclusion. 
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2. Map-making with m-modes 
2.1. The m-mode Formalism 


Although imaging the sky by solving for T in Equation (4) 
directly is intuitive, the amount of computation is very large. If 
the interferometer has N, baselines and has run for Mime 
rounds, for each single frequency and Npix discrete sky pixels, 
the visibility vector V has a size of Ny X Mime, the dimensions 
of beam transfer matrix B will be (Nui X Mime, Npix). Zheng 
et al. (2017) demonstrated this method for a small array. For 
Tianlai cylinder pathfinder, the number of non-redundant 
baselines is N,) +3300, and the time samples are Mime 
21 600 in a sidereal day for 4 s integration time. If we pixelate 
the whole sky within latitude [—30°, 90°] in HEALPix scheme 
with N.ide = 256 (Gorski et al. 2005), the pixel number is 
Npix © 6 X 10°. Then the dimensions of the transfer matrix B 
would be 71M x 0.6M per frequency. Apparently, solving 
Equation (4) directly would be intractable for a large array, due 
to the large matrix involved. 

The m-mode method (Shaw et al. 2014, 2015; Zhang et al. 
2016b) provides a convenient and computationally efficient 
approach for the data processing and analysis of drift scan 
telescopes, especially wide-field telescopes. It has been applied 
in the data analysis of LWA (Eastwood et al. 2018), EDA2 
(Kriele et al. 2022) and CHIME (CHIME Collaboration et al. 
2022). We also have implemented this method in the map- 
making procedure of Tianlai data processing pipeline tlpipe 
(Zuo et al. 2021). 

As the drift scan telescope measures the sky periodically 
with the rotation of the Earth, we can decompose T (ñ) and 
B,(#; o) in spherical harmonics, 


B,(A; p) =Y BË YA) 


lm 


T (ñ) = 5 im Yim(@), 


lm 


the Fourier transform of the visibility V;(¢) can be written as 
vi= f voem 
2m ` 
d ij —imo 
=D f| EB ame’, (5) 
Im! 27 


and take the rotated beam transfer function BË ($)= 
BË ($ = O)e’"®, Equation (5) adding up the noise term 
becomes 


Vi = >> Bi am + në, (6) 
l 


which gives the key equation in m-mode analysis, we can 
rewrite it in matrix form as 


Ona,t) = Bm, dO Am) + andi, (7) 
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The label (ij, +) indicates that the positive and negative values 
of m are grouped together for the fact that the positive and 
negative m-modes are measurements of the same real-valued 
sky, which gives a, —m—=(—1)"a@im. We can rewrite 
Equation (6) in a general form as 


v = Ba + n, (8) 


where matrix B has a block diagonal structure, so we can treat 
each m-block independently. Then the Fourier transform of the 
visibility v is related to the spherical harmonics coefficients of 
the sky a by a simple linear equation. The imaging process is to 
solve for a given the observation v with prior information about 
B and available n. Compared with directly solving for T(n;) in 
Equation (3), the m-mode method handles matrix operations 
with much smaller matrices, which increases the computational 
efficiency tremendously. The main portion of computation is 
devoted to the transfer matrix B. However, once computed and 
stored, it can be reused in subsequent processing. 


2.2. Simulation 


We then simulate the map-making process, here the 
unpolarized case with a single frequency (750 MHz) is 
considered. We use the publicly available software cora® 
(Shaw et al. 2014, 2015) to generate our mock sky model, which 
contains foreground including diffuse emission from the Galaxy 
and extra-galactic point sources, alongside the cosmological 
21cm signal. The diffuse emission is generated by extrapolating 
Haslam map with a specified spectral index map, and including 
random spectral and angular fluctuations. The extra-galactic 
point sources consist of a catalog of real bright sources from 
NVSS and VLSS, a synthetic catalog of fainter sources, and a 
random background for even fainter sources. The cosmological 
21cm signal is generated by drawing Gaussian realizations of a 
power spectrum (Shaw et al. 2014, 2015). 

We model the beam pattern, which is characterized as a long 
strip along the North-South direction, in the following 
analytical form (see paper I for details), with parameters 
determined from a fit to the electromagnetic simulation of 
Tianlai cylinder array (Sun et al. 2022): 


A(f) = Ans(sin™ (Â - £); Ons) x Agw (sin™!(û - f), F, Ogw), 


where ê and y are the unit vector pointing East and North, 
respectively, and 


2 
Ays(@; v) = exp [~a zt Ja | (9) 


4nFWu 


a 
Agw (0; F, Ogw) x f” Ap(2tan™!u; Ogw)e s Si" du, 
~~ aF 


(10) 


© https: / /github.com /radiocosmology /cora 
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where Oys(v) = a-~, in which Dys = 0.3 m as the size of the 
NS 

Tianlai cylinder feeds, and Ap is taken as 

In2 tan? 0 | 


A 6; 0 = ex car ea 
p(O; Orwum) P | 2 tan?(Opwum/2) 


We use the mean of the power beam of X and Y polarization at 
750 MHz from Sun et al. (2022) to fit the models above. The 
fitting parameters are a = 1.04, F = 0.2, Qgw = 2.74. 

The noise in each m-mode (Shaw et al. 2015) is generated by 
sampling a Gaussian distribution with rms 


Qi; Tos Ts ' 
= Ni i sne( =) (11) 


Naay Nred tsid Av 


ï 
tsid 


where T; and T;** are the system temperature for feed i and j, 
Qy = JQ; is the geometric mean of individual beam solid 


angle, in which Q; = f a’njA;(a)/?, Naay is the number of 
observation days. N,eq is the number of redundant baseline, tsia 
is the sidereal day in second, Av is the bandwidth, tint is the 
single integration time, for the unpolarized case, ô; = by / V2. 
Below we assume all feeds have the same system temperature 
and primary beam profile. We take the system temperatures as 
90 K for all feeds from the estimation in Li et al. (2020), 
tin = 4s and Av=122kHz for the current configuration of 
Tianlai cylinder pathfinder, and take Naay = 1 for a single 
sidereal day observation, or 30 as an imitation for multiple 
sidereal days stacking which increases the signal-to-noise ratio. 


2.3. Beam Coverage in Spherical Harmonics Space 


The angular resolution and sensitivity on the sky of a 
telescope is limited by its size and configuration. We take the 
maximum spherical harmonics degree / and order m that 
Tianlai cylinder pathfinder sensitive to at 750 MHz as 


Imax = 27Dmax/ 740, max = 27Dpw/d x 715, 


where Dmax is the maximum dimension of the entire array, Dew 
is the physical size in the East-West direction. 

We can define the spherical harmonic beam coverage of 
array by combining the beam transfer matrix from all baselines 
as 


N 
D |Biimil 


Biim] = moh > (12) 
max()> |Briml) 


n=1 


where N is the total number of non-redundant baselines. This 
quantity can be used to describe the sensitivity of array on the 
sky in spherical harmonic space, analogy to the uv coverage, 
which is used to define the spatial frequencies measured by the 
array. In Figure 1, we show the beam coverage Bm; of the 
Tianlai cylinder pathfinder array. 
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Figure 1. The spherical harmonic beam coverage By, of the Tianlai cylinder 
pathfinder array. The three vertical dotted lines indicate the m values 
corresponding to the width of a single cylinder (15 m), the maximum baseline 
projected in the East-West direction (30 m), and the total width of the array 
(45 m). 


The resolution in the East-West direction of a baseline is 
determined by its projected distance along this direction. The 
limit on m for a telescope is m < 27Dgwcos6/X by taking 
into account that the m mode corresponds to the Fourier mode 
in the azimuthal direction, where 6 is the latitude of the Tianlai 
site about 44°, thus m < 510 with Dew = 45, A ~ 0.3997 at 750 
MHz. The beam coverage is expected to center at 
(l, m) = QrD/X, 27Dgwcos6/X). Owing to the massive 
isometric short baselines on the identical cylinder, we expect to 
see two areas in the spherical harmonic space where the Tianlai 
cylinder pathfinder has higher sensitivity, which can be 
estimated by 


(leenters Meente) © (204 D2 + b? /A, 2D, cos 6/2), 


with D,, equals 1 or 2 times the cylinder width, and b = 0.4 m, 
these are at (J, m) ~ (235, 170) and (470, 340), which is indeed 
the case as can be seen in Figure 1. 

Roughly, the sensitivity of our model telescope is relatively high 
in the range of about [m, m + 200] at m<400, and decreases 
significantly at and outside the edge region (i.e., m ~ 510) that the 
telescope can reach. Consequently, when reconstructing the sky 
map from the spherical harmonic coefficient a,,, in this paper, we 
filtered out all modes / > 600, as well as the m = 0 modes which 
measure the average over sidereal time. 


2.4. Linear Least-squares Map-Maker 


The imaging procedures can be seen as trying to solve the 
solution of Equation (8) for the spherical harmonic coefficients 
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Figure 2. The singular values of matrix B for several m values. 


a given the measurement v and the beam matrix B. The least- 
squares solution to it, which minimizes ||v — Bal|3, is given by 


as = (B*BY'B*v, (13) 


where * denotes the conjugate transpose. 

In our current work, before making the general linear least- 
squares solution, we first subtract the contributions of four very 
strong radio sources to the visibilities, i.e., Cas A, Cyg A, Tau 
A, and Vir A, which can be treated as point sources at known 
locations for the current Tianlai cylinder array pathfinder. This 
reduces the dynamic range of the input map. If they are not 
subtracted, the reconstructed map will show some apparent 
sidelobes around their positions. 

The matrix B usually does not have full rank, as there are 
unmeasured modes due to incomplete uv-coverage and 
incomplete coverage of the full sky. In Figure 2, we plot the 
singular values of several B, a cluster of small singular values 
approaching zero results in these matrices being rank-deficient. 
Hence B*B is not invertible, which makes the problem ill- 
posed. The solution to ill-posed inverse problems is usually 
unstable, which may deviate greatly from the true one in the 
presence of noise. 


3. Regularization 


A common approach to ill-posed problems is regularization 
(see, e.g., Engl et al. 1996; Hansen 1998), which gives a stable 
approximate solution by imposing additional constraints on it. 
The common regularization methods include Truncated Singular 
Value Decomposition (TSVD) and Tikhonov regularization. 


3.1. Truncated Singular Value Decomposition 


For the least-squares solution (Equation (13)), the matrix B 
generally does not have full rank, B*B is then singular and not 
invertible. With the singular values decomposition of the m x n 
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Figure 3. The reconstructed sky maps with different truncation thresholds for the data with a noise level corresponding to 30 days integration time. From left to right, 
e=1 x 1074, 1 x 107, and 5 x 107°. The four brightest point sources (i.e., Cas A, Cyg A, Tau A and Vir A) have been removed in the visibility before 


reconstruction. 


(m > n) matrix B, 


n 
B = UXV*=)_ ujaw, 
i=1 
where U = (uy, Ud, +: u,) and V= (v1, Vo, ++ Wn) are unitary 
matrices, and the diagonal matrix 4 = diag(o), 02, ++ ,o,) has 
non-negative elements in non-increasing order. The Moore- 
Penrose pseudo-inverse of matrix B is defined as 


Bt = V>"'U*, 


where + denotes the pseudo-inverse, the diagonal element of 
matrix X7! is the reciprocal of each non-zero element on the 
diagonal of X, leaving the zeros in place. The minimum-norm 
least-squares solution to Equation (8) in terms of the Moore— 
Penrose pseudo-inverse is then given by 


Gis = Btv = VD"'U*y 


r x 
=> “2y, (14) 
i=1 Oj 


where r=rank(B). For the noisy measurement v = vo +n, 
where vo = Bajue, this solution becomes 


r 


* r 
âs = yy y, (15) 


i=1 i i=1 i 


the first term gives the true solution, while the second term 
gives the contribution from noise. The latter might be 
magnified by those extremely small singular values o; of B, 
making the solution unstable and heavily contaminated by 
noise. 

Regularization is a technique to solve the problem above by 
dampening or filtering out those unwanted components 
corresponding to the small singular values, leading to an 
approximate but stable solution. In terms of the singular values 
decomposition of matrix B, the regularized solution to 
Equation (8) can be written as 


: T ušřv 
areg = SF —Vi, (16) 
{=l i 


where f; are the filter factors. Different regularization methods 
can be defined by choosing suitable filter factors. 


For the TSVD regularization, the filter factors are 


l GO 
TSVD i Z Ok 
fi o Oi < OK (17) 


the pseudo-inverse B* in Equation (14) is replaced by its rank- 
k approximation, with the remaining components filtered out. 
In paper I, we used the TSVD regularization to compute the 
regularized solution of Equation (8), where the truncation 
thresholds e satisfies oj.) < € x max(g;) < ok, and we 
selected e = 1 x 107° as the default. 

As an illustration, we show in Figure 3 the reconstructed maps 
with three truncation thresholds e, which can represent the case 
with a too small, moderate and too large €, respectively. Noise in 
the reconstructed map is significantly amplified if there is no 
regularization or with a too small threshold value €, the image is 
totally dominated by noise. For a too large e€, although noise is 
not significantly amplified, the reconstructed signal may fail to 
capture the true signal completely in certain modes, so again the 
resulting image is not optimal. For a properly chosen regulariza- 
tion value, we can see the reconstruction result is relatively good, 
with the input sky map well recovered. However, as we noted in 
paper I, even in this case, near the bright part of the Galactic 
plane, there are comb-like artifacts, which are produced by the 
incomplete reconstruction of certain modes which are truncated, 
especially those with small m values in our case. 


3.2. Tikhonov Regularization 


Another widely used regularization method is Tikhonov 
regularization. In contrast to TSVD regularization, which 
imposes a hard threshold for the components corresponding 
to those small singular values, Tikhonov regularization aims to 
suppress undesirable modes by solving for a that minimizes 


argmin{||v — Ba||} + XI|L(a — adl}, (18) 


where || - [|2 denotes the Frobenius norm, A is the regularization 
parameter, a is a prior of a, which can be set to ay = 0 if prior 
information is unavailable, and L is called the Tikhonov 
regularization matrix. There are several common choices for L, 
including the identity matrix or a discrete approximation of the 
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Figure 4. The reconstructed sky map with different Tikhonov regularization parameter À. From left to right, A? = 107°, 9 x 10-8, and 1075. 


derivative operator. This minimization problem gives solution 
â = (B*B + XI*LY !(B*v + XL*Lao). (19) 


In a simple case where L = J, the Tikhonov regularization is 
referred to as the standard form. The more general form of 
Tikhonov regularization, where L #7 and prior information 
about a and n is available, can be transformed into the standard 
form (Eldén 1977). For simplicity, we adopt L =I and ay = 0, 
then the objective function becomes 


argmin{||y — Ball + \’\la\3}, (20) 


and Equation (19) is reduced to 
Gx = (B*B + XI) 'B*v. (21) 


It can be expressed in the form of Equation (16) with filter 
factors as 


2 
ik 0; 1 o> 
ee ee . 22 
fi o +X f G<KA a 


Unlike the TSVD regularization which simply trims out the 
modes corresponding to small eigenvalues, the Tikhonov 
regularization makes a gradual and smooth suppression to all 
modes. The degree of smoothness can be adjusted by the 
choice of the regularization parameter À, a small regularization 
parameter value yields a solution which is closer to the one 
obtained from TSVD regularization. 

In Figure 4, we illustrate the Tikhonov regularized sky maps 
with different regularization parameter A. For a small A, as 
shown in the left panel, the noise in the regularized map is 
amplified obviously. The noise becomes less pronounced with 
a moderate À value, as shown in the center panel. The comb- 
like artifact around the Galactic Center shown in Figure 3 is 
also less prominent, perhaps thanks to the gradual suppression 
on the modes in the Tikhonov regularization. However, for the 
case with an excessively large A shown in the right panel as an 
example, the underlying signal is also suppressed, generating a 
more bland map than the actual sky. Furthermore, accompanied 
with the diminished actual sky signal, the sidelobes from some 
bright sources are more pronounced, e.g., an arc structure 
becomes noticeable near the north celestial pole and above the 
Cyg A. An appropriate selection of value, which strikes a 
balance between noise amplification and information loss, is 


expected to yield a satisfactory map that contains moderate 
noise while faithfully preserving the true sky as much as 
possible. 

The regularized solution error is given by 
u;\n 
l 


Atrue — areg = ya — fv (true) Vi = Df Vi, (23) 
i=1 


i 
i=1 Oi 
where the first term corresponds to the regularization error, 
arising from the regularization of true signal wue, and the 
second term represents the perturbation error, attributed to the 
presence of noise. The filter factors f; can be aes or ta For 
Tikhonov regularization, as A — 0, the filter factors he — I, 
the regularization error approaches zero, but the perturbation 
error might be large, the solution tends to be noisy. As A 
increases, the filter factors i — 0, leading to a smaller 
perturbation error but a larger regularization error, and the 
regularized solution is oversmoothed and approaches zero. The 
key to obtaining a good regularized solution is to choose an 
optimal regularization parameter À in Tikhonov regularization 
or k in TSVD regularization to balance the two errors. 


3.3. The Choice of Regularization Parameter 


Choosing the optimal regularization parameter that balances 
the trade-off between the noise amplification and signal 
recovery is not always easy, which depends on particular 
problems including factors such as the specific array config- 
uration, and the level of noise in the data. There are still several 
approaches available for choosing the regularization parameter 
that is near the optimal one. Below we adopt the notation used 
in Tikhonov regularization (i.e., take A as regularization 
parameter) for our description, these methods are also 
applicable to TSVD regularization. If the statistics of the noise 
is well known, can be chosen by applying the discrepancy 
principle (Engl 1987), such that the residue is at a comparable 
level to that of the noise, ||Ba, — v||2 < 7||n||2, where a, is the 
solution given by Equation (21) with the regularization 
parameter value A, and 7 is a user-specified constant to 
constrain the bound. If the error is unknown, techniques such as 
the generalized cross validation (Golub et al. 1979) or the 
L-curve criterion (Hansen & O’Leary 1993) are usually applied 
to search for the appropriate regularization parameter. 
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Figure 5. The GCV function as a function of A for m = 200 case. The red star 
marks the minimum point of the GCV function, the corresponding A gives the 
optimal regularization parameter by this method. 


Generalized Cross Validation (GCV)—The GCV method 
(Golub et al. 1979) is based on the following idea: for a linear 
vector equation Ax =b, if we drop a data point b’ from the 
vector b, and obtain a regularized solution xi from the rest of 
vector components A([{1l:i- 1, i+ 1: ml], :) 
xi ~ b([1: i — 1, i+ 1: m]), then the value A(i, :)xi should 
be close to the excluded value b’ if a reasonable parameter A is 
chosen (see Chapter 4 of Wahba 1990). This results in the 
selection of a parameter À that minimizes the GCV function 


|Bay — vI? 
(trace(I — BB#(X)))?’ 


where B#(\) = (B*B + XIy'B* is referred to as the 
regularized inverse, a, = B*y is the corresponding regularized 
solution. 

For illustration, in Figure 5 we show the logarithmic plot of 
the calculated GCV function applied to the data with 30 days 
integration noise for the case of m= 200. In our computation, 
the \ values vary from 10~'° to 107" and are sampled evenly 
on logarithmic scale. The GCV function decreases slowly as À 
increases, until at some point increases rapidly, and the 
minimum is reached before this rapid rise, which is marked 
by a red star in the figure. 

L-curve—In a log-log scale plot depicting the residual norm 
||Ba, — v||2 versus the solution norm ||a,||2, the resulting curve 
typically exhibits a L-shaped profile, which is comprised of a 
flat part where the regularization error dominates at large A, and 
a steep part where the perturbation error dominates at small A. 
The optimal A value should be chosen to balance these two 
errors, which corresponds to the corner of the L-curve, and can 
be identified by locating the maximum curvature of the curve 
(Hansen 1999). Let # = loglla l$, ô = log||Ba, — vll, then 


G(A) = (24) 


Yu et al. 


the curvature of the L-curve is given by 


eee phy” _ phy! . (25) 
(OF + Arre 

In Figure 6 we plot the L-curve (left) and the corresponding 

curvature «K (right) applied to data with the 30 days integration 

noise for the m= 200 case, the red star marks the optimal 

parameter À determined using this method. 

The optimal values for the regularization parameter as 
determined by the GCV criterion and the L-curve criterion for 
all m cases are shown in Figure 7. Comparing the optimal 
values determined by these two criteria, it is observed that the 
optimal À obtained from the L-curve criterion is typically larger 
than the one provided by the GCV criterion at the same m, 
especially for the case of higher noise (e.g., 1 day integration 
noise level). The optimal A values increase as m at m < 100, 
and then are relatively stable for the rest m. 

In Figure 8, we present a comparison between the input aj, 
and the Tikhonov regularized solution to the data with 30 days 
integration noise for the m= 10 (left) and m = 200 (right) cases 
as an example, where the regularization parameter are given by 
the two criteria. The real and imaginary parts are plotted in the 
top and bottom sub-figures, and the residues are plotted in the 
bottom panel of each sub-figure. As expected, a smaller 
regularization parameter obtained from the GCV criterion results 
in a solution that is slightly closer to the true value, especially at 
the smaller /s, but it may also amplify the noise for certain 
modes, as illustrated here in the region Z€ [200, 400] for the 
m= 10 case. While for the m= 200 case as shown, with the 
smaller regularization parameter provided by the GCV criterion, 
the residue is smaller without the cost of amplifying the noise. 

Moreover, we apply the L-curve criterion to choose the 
truncation threshold value € or k in Equation (17) for the TSVD 
regularization. we sample 100 values of e evenly on a 
logarithmic scale ranging from | x 1074 to 1 x 107" to compute 
the points on the TSVD L-curve (||Ba; — v||2, ||a,||2). Different 
from the case of Tikhonov regularization, the points on TSVD 
regularization L-curve are discrete, a quadratic spline interpola- 
tion is applied to these discrete values to compute the curvature 
of the L-curve and select the corresponding sampled e value 
closest to the maximum curvature. In Figure 9, we show results 
for the data with 30 days integration noise. For comparison, the 
case for fixed «= 1 x 107° is also plotted. We can see that at 
large m the two curves almost coincide, but at some relative 
smaller m the L-curve suggests a smaller k, indicating that more 
low sensitive modes are trimmed off to avoid amplifying noise. 

In the top panels of Figure 10 we illustrate the reconstructed 
maps with the automatically chosen regularization parameters 
for the data with 30 days noise level, and in the bottom panels 
their relative error, i.e., the fractional difference with the input 
map. We show the cases of Tikhonov regularization using the 
GCV (left) and the L-curve (middle) criteria, and the TSVD 
regularization using the L-curve criterion (right). 
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Figure 6. The L-curve (left) and the corresponding curvature (right) « as a function of À for m = 200 case. The red star marks the point of maximum curvature at the 


corner of the L-curve. 
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Figure 7. The optimal regularization parameters for all 0 < m < 600 with GCV and L-curve methods considering 1 day noise level (left) and 30 days noise level 


(right). 


It seems that the Tikhonov regularized solutions generally 
yield visually better maps of the sky without the comb-like 
feature in the TSVD regularized map. The errors are typically 
small except around places where bright sources are located. 
With the A value obtained by the GCV criterion, which is 
smaller than the one chosen with the L-curve criterion, the 
region near the Galactic Center is better reproduced, but the 
northern region tends to be noisier. With the larger À value 
chosen using the L-curve criterion, the map is more smooth. 
However, we note that the A value chosen by both methods 
would vary with the noise level, therefore in the present case 
the larger À value obtained from the L-curve criterion produces 


a better overall visual impression, this may change for a 
different noise level or set up. Owing to the flat tail of the GCV 
function curve (Figure 5), the A value determined from the 
minimum of the GCV function might not always be accurate 
and satisfactory, while the L-curve method gives a more robust 
result. In the reconstructed maps, the smaller regularization 
parameter obtained from the GCV criterion leads to a more 
noisy reconstructed map compared to the one obtained using 
the L-curve criterion. As for the TSVD regularized map, it is 
generally similar to the map shown in Figure 3, and still shows 
the comb-like structure near the Galactic Center which is 
produced by the truncation of certain modes. 
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Figure 8. The comparison between the input (blue) and the solved spherical harmonics coefficients obtained using the Tikhonov regularization, with the regularization 
parameter determined by the GCV (green) and L-curve (orange) methods, for the m = 10 (left) and m = 200 (right) modes, in the case of 30 days noise level. The top 
sub-figure of each column shows the real part, and the bottom one gives the imaginary part. The bottom panel of each sub-figure shows the residue between the input 


and the solution. 


3.4. An Overall Regularization Parameter 


As matrix B in Equation (8) is block diagonal, the m-mode 
method takes advantage of this structure by processing each m- 
block individually. To produce the regularized maps, we have 
also optimized the choice of regularization parameter for each m 
separately, using the GCV and L-curve criteria. For simplicity, it 
is also possible to adopt a single overall regularization parameter 
A considering all m-blocks. We illustrate the L-curve criterion 
for Tikhonov regularization here. Then the point on the L-curve 
associated with the regularization parameter À is give by 


M max 


(Eisa -mlo Y lanl} 
m=0 


m=0 


In Figure 11, we show the relevant L-curve for two cases 
with different noise level. For the 1 day case, the noise level is 
relatively high, and the L-curve criterion yields a relatively 
large optimal regularization parameter value À = 2.31 x 107°, 
while for the 30 days case, the level of noise is significantly 
lower, leading to a correspondingly smaller optimal regulariza- 
tion parameter value À = 5.34 x 10-4. 

In Figure 12 we illustrate the reconstructed map with these 
regularization parameters (the visibility data for imaging has 
the corresponding level of noise). In the 1 day case, due to the 
relatively large noise and regularization parameter, the map 
shows more deviation than the 30 days map, and the side lobe 
feature near the north celestial pole is quite obvious. However, 
we can see that for the 30 days noise level map, the deviation is 
also quite significant if we compare it to the maps made by the 
m-by-m mode analysis shown in Figure 10. This is not 
surprising, since a single \ value may not be suitable for all 
different m-modes. For those less sensitive modes that are 
susceptible, this value of is perhaps too small to regularize 
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Figure 9. The truncation parameter k for TSVD regularization obtained by the 
L-curve criteria for data with the 30 days integration noise. For comparison, the 
case for fixed € = 107° is also plotted. 


the problem. On the other hand, for the modes to which the 
interferometer is sensitive, it may suppress too much to 
reconstruct the true signal. 


4. Discussions 


Since the regularized solution is only an approximation of 
the true value, it inevitably introduces bias. In Figure 13, we 
show the fractional difference between the Tikhonov regular- 
ized map from the noise-free data and the input map, which 
gives the bias from regularization. We can see that the major 
bias is around the bright point sources and is located at the 
region near the horizon. The bias around the bright point 
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Figure 10. The reconstructed regularized maps (top) and their fractional error (bottom) using automatically determined regularization parameters. Left: Tikhonov 
regularization with A from the GCV method; Middle: Tikhonov regularization with A from the L-curve method; Right: TSVD regularization with threshold given by 
the L-curve method, all for the data with 30 days integration noise level. 
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Figure 11. The overall (summing all m-modes) L-curve of Tikhonov regularization for the data containing 1 day (left) and 30 days integration noise (right), and the 
red star marks the regularization parameter determined from each curve. 


Figure 12. The Tikhonov regularized maps made with the optimal A determined from the overall L-curve for the data with 1 day (left) and 30 dayss noise (right). A 
larger regularization parameter (left) may result in an alias of the Galactic Center obviously in the map. 
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Figure 13. The fractional error caused by regularization in the Tikhonov 
regularized maps for the 30 days noise level data, where the regularization 
parameter is determined by the L-curve for each m. The overall effect of the 
additional noise error is illustrated in Figure 10. 


sources is from their convolution with the point-spread 
function, which is expect to be mitigated through further 
deconvolution techniques such as the CLEAN algorithm. 

We can also quantify the quality of the reconstructed map by 
using the angular cross power spectrum between the input map 
and the reconstructed maps. In the cross power spectrum, 
defined as 


, 1 : 
amah) = Yo a 


2l F 1 m=—l 


re 7 in* 
Im“Im > 


(oes = ( 


the noise from the maps being crossed are uncorrelated, so they 
do not contribute to the cross power. 
In Figure 14, we show the cross-correlation coefficients, i.e., 


gu / crc, the ratio of the cross-correlation power 
between the reconstructed and input maps to the square root of 
the product of the corresponding auto power spectra. For 
the Tikhonov regularization, we show the cases where the 
regularization parameter is determined m-by-m using the GCV 
and L-curve criteria, and for the TSVD regularization, the 
truncation threshold is determined based on the L-curve criterion. 

In all cases, the correlation falls below 1 as / increases, which 
means that the correlation is not perfect due to reconstruction 
error. There is a general trend which all three curves follow. 
The result obtained from TSVD regularization has a correlation 
coefficient close to that of the Tikhonov regularization with the 
A determined by the L-curve at / < 470, but beyond this scale 
there is a significant deterioration. The result of the Tikhonov 
regularization with A determined by the GCV criterion exhibits 
higher correlation in the range 7<150, but shows lower 
correlation in the range 200 < / < 550 than that of the L-curve 
criterion, where the noise is prone to be amplified. However, 
despite comparable performance as measured by the cross- 
correlation, the Tikhonov regularization produced better visual 
impression. 
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Figure 14. The cross-correlation coefficientsC/*-" / cc} for the Tikhonov 
regularized maps with the parameter determined using the GCV and L-curve 
criteria, and the TSVD regularized maps with the parameter determined using 
the L-curve criterion. All angular power spectra are computed for the 
observable part of sky (6 = — 45°) and with the four bright point sources 
subtracted. 


5. Conclusion 


In this paper we investigated the regularization methods 
applied to the sky map reconstruction from the radio 
interferometer data taken by the Tianlai cylinder pathfinder 
array. Due to the incomplete uv-coverage from the limited 
number of baselines, regularization is generally necessary in 
the reconstruction of sky from interferometer data, but the 
strategy and method adopted differ case by case. In our 
previous paper (Yu et al. 2023), we have investigated the map 
reconstruction for the Tianlai cylinder array, with emphasis on 
assessing the impact of array calibration error and noise on the 
final map, there we have only used the Moore—Penrose pseudo- 
inverse method in the map reconstruction. However, there are 
various different regularization methods, which also affect the 
map-making result. The exploration of the different regulariza- 
tion methods is the subject of the present paper. 

In this work we have studied the TSVD regularization and 
the Tikhonov regularization applied to our map reconstruction. 
The TSVD regularization is in some sense similar or equivalent 
to the Moore-Penrose pseudo-inverse used in paper I, which 
removes the modes susceptible to noise. The Tikhonov 
regularization, on the other hand, suppresses modes gradually, 
those susceptible to noise are more suppressed but not 
completely removed, consequently yielding a more smoothly 
regularized map. This smooth approach by the Tikhonov 
regularization can avoid the generation of obvious artefacts by 
the sharp cut off used in the TSVD regularization, producing 
visually better maps, though it is not necessarily more accurate 
than the TSVD regularization in a quantitative sense. 
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To obtain a high-fidelity sky map with the regularization 
techniques, it is crucial not to over-regularize the data by using 
an excessively large regularization parameter. However, the map 
may be greatly affected by the noise if a too small regularization 
parameter is chosen. We have applied the GCV and L-curve 
methods to determine the optimal regularization parameter. We 
find that both methods do produce generally good maps for a 
reasonable level of noise. In our case the L-curve criterion 
provides a more stable regularization parameter, which also 
results in a map with better visual impression. However, we note 
that this result is specific to the case we investigated. For 
different data sets, noise levels, etc., the result can be different. 
Furthermore, although these methods can be used to optimize 
parameter selection, they are still based on simple reasoning. To 
achieve results that better meet specific requirements, additional 
tuning may be needed. 
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